Exact coalescent likelihoods for unlinked markers in finite-sites mutation models
Abstract
We derive exact formulae for the allele frequency spectrum under the coalescent with mutation, conditioned on allele counts at some fixed time in the past. We consider unlinked biallelic markers mutating according to a finite-sites or infinite-sites model. This work extends the coalescent theory of unlinked biallelic markers, enabling fast computation of allele frequency spectra in multiple populations. Our results have applications to demographic inference, species tree inference, and the analysis of genetic variation in closely related species more generally.
Related articles
Inferring species trees directly from biallelic genetic markers: bypassing gene trees in a full coalescent analysis.
The multispecies coalescent provides an elegant theoretical framework for estimating species trees and species demographics from genetic markers. However, practical applications of the multispecies coalescent model are limited by the need to integrate or sample over all gene trees possible for each genetic marker. Here we describe a polynomial-time algorithm that computes the likelihood of a sp...
Coalescent: an open-science framework for importance sampling in coalescent theory
Background. In coalescent theory, computer programs often use importance sampling to calculate likelihoods and other statistical quantities. An importance sampling scheme can exploit human intuition to improve statistical efficiency of computations, but unfortunately, in the absence of general computer frameworks on importance sampling, researchers often struggle to translate new sampling schem...
Computation of the Likelihood in Biallelic Diffusion Models Using Orthogonal Polynomials
In population genetics, parameters describing forces such as mutation, migration and drift are generally inferred from molecular data. Lately, approximate methods based on simulations and summary statistics have been widely applied for such inference, even though these methods waste information. In contrast, probabilistic methods of inference can be shown to be optimal, if their assumptions are...
Hybridsim: Simulator for generating allele data in isolated populations
We propose a novel two-phase population simulator for generating diploid marker allele data in isolated populations. Our simulator extends Populus, an exact forward-in-time simulator for isolated populations, with a coalescent simulator. The coalescent, while not suitable for completely replacing Populus, is a useful substitute for the over-simplified founder generator used in Populus. We belie...
Multi-locus match probability in a finite population: a fundamental difference between the Moran and Wright–Fisher models
MOTIVATION: A fundamental problem in population genetics, which is also of importance to forensic science, is to compute the match probability (MP) that two individuals randomly chosen from a population have identical alleles at a collection of loci. At present, 11-13 unlinked autosomal microsatellite loci are typed for forensic use. In a finite population, the genealogical relationships of i...